The CNVrd2 package: measurement of copy number at complex loci using high-throughput sequencing data

Authors

  • Hoang T. Nguyen
  • Tony R. Merriman
  • Michael A. Black
Abstract

Recent advances in high-throughput sequencing technologies have made it possible to accurately assign copy number (CN) at CN variable loci. However, current analytic methods often perform poorly in regions where complex CN variation is observed. Here we report the development of a read depth-based approach, CNVrd2, for investigating CN variation using high-throughput sequencing data. The methodology was developed using 1000 Genomes Project data from the CCL3L1 locus and tested using data from the DEFB103A locus. In both cases, samples were selected for which paralog ratio test data were also available for comparison. The CNVrd2 method first uses observed read-count ratios to refine segmentation results in one population. A linear regression model is then applied to adjust the results across multiple populations, in combination with a Bayesian normal mixture model that clusters segmentation scores into groups corresponding to individual CN counts. The performance of CNVrd2 was compared to that of two other read depth-based methods (CNVnator, cn.mops) at the CCL3L1 and DEFB103A loci. The highest concordance with the paralog ratio test method was observed for CNVrd2 (77.8/90.4% for CNVrd2, 36.7/4.8% for cn.mops and 7.2/1% for CNVnator at CCL3L1 and DEFB103A). CNVrd2 is available as an R package as part of the Bioconductor project: http://www.bioconductor.org/packages/release/bioc/html/CNVrd2.html.
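The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the CNVrd2 implementation: it swaps the paper's Bayesian normal mixture model for a plain maximum-likelihood EM fit of a one-dimensional normal mixture, and it runs on simulated segmentation scores. The function names (`fit_mixture`, `assign_groups`), the starting means, the initial standard deviation and the simulated data are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_mixture(scores, init_means, n_iter=200, min_sd=1e-3):
    """Fit a 1-D normal mixture by EM (a simplification, not a Bayesian fit).

    Returns (weights, means, sds), one entry per mixture component.
    """
    k = len(init_means)
    means = list(init_means)
    sds = [0.25] * k          # hypothetical starting spread
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in scores:
            dens = [w * normal_pdf(x, m, s)
                    for w, m, s in zip(weights, means, sds)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update weight, mean and sd of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / len(scores)
            means[j] = sum(r[j] * x for r, x in zip(resp, scores)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, scores)) / nj
            sds[j] = max(math.sqrt(var), min_sd)
    return weights, means, sds


def assign_groups(scores, weights, means, sds):
    """Assign each score to the component with the highest weighted density."""
    groups = []
    for x in scores:
        dens = [w * normal_pdf(x, m, s)
                for w, m, s in zip(weights, means, sds)]
        groups.append(max(range(len(dens)), key=dens.__getitem__))
    return groups


# Simulated segmentation scores for three CN groups (illustrative only).
random.seed(0)
true_centres = [-1.0, 0.0, 1.0]
scores = [random.gauss(c, 0.1) for c in true_centres for _ in range(30)]

weights, means, sds = fit_mixture(scores, init_means=[-1.2, 0.1, 0.9])
groups = assign_groups(scores, weights, means, sds)
```

The group indices here are ordinal only. Mapping a group to an absolute copy number requires additional information, such as the paralog ratio test data used for comparison in the paper.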

Similar resources

seqCNA: A Package for Copy Number Analysis of High-Throughput Sequencing Cancer DNA

This document guides the reader through the use of the seqCNA package. The aim of the package is to process high-throughput sequencing copy number data coming from tumoural samples, departing from SAM or BAM aligned reads up to the final stage of calling the copy numbers. It includes, among other features, an integrated summarization method, several filters based on a range of genomic and read-...

QuicK-mer: A rapid paralog sensitive CNV detection pipeline

QuicK-mer is a unified pipeline for estimating genome copy-number from high-throughput Illumina sequencing data. QuicK-mer utilizes the Jellyfish application to efficiently tabulate counts of predefined sets of k-mers. The program performs GC-normalization using defined control regions and reports paralog-specific estimates of copy-number suitable for downstream analysis. The package is freely ...

CNVrd2: A package for measuring gene copy number, identifying SNPs tagging copy number variants, and detecting copy number polymorphic genomic regions

2 Getting started; 2.1 Measuring FCGR3B CN; 2.1.1 CNVrd2 object; 2.1.2 Count reads in windows; 2.1.3 Segmentation; 2.1.4 Obtain co...

CNAnorm: A package to detect Copy Number Alterations (CNA) from sequencing data

CNAnorm is a package for the analysis of Copy Number Alteration (CNA) of tumour samples using low coverage (around 0.01-0.5X) high-throughput sequencing [1]. In particular, CNAnorm aims to perform a meaningful normalisation of the sample by estimation of the underlying tumour's ploidy. CNAnorm allows both a fully automated as well as an interactive approach to the normalisation step. It provides...

biomvRCNS: Copy Number study and Segmentation for multivariate biological data

With high-throughput experiments like tiling array and NGS, researchers are looking for continuous homogeneous segments or signal peaks, which would represent chromatin states, methylation ratio, transcripts or genome regions of deletion and amplification. In a normal experimental set-up, these profiles would be generated for multiple samples or conditions with replicates. In the package ...

Journal title:

Volume 5  Issue 

Pages  -

Publication date: 2014